Scalable Techniques for Creating Semantic Vector Representations

Authors

  • Michael N. Jones
  • Gabriel Recchia
Abstract

Current vector space models for representing lexical semantics rely on sophisticated dimensionality reduction operations. This complex data reduction step limits the ability of these models to scale up and take advantage of much larger text corpora. We examine previous work with scalable metrics, and explore the ability of scalable and incremental random vector accumulation (RVA) techniques to learn semantic representations gracefully from massive corpora. We compare RVAs to scalable metrics such as pointwise mutual information (PMI), as well as to latent semantic analysis (LSA), on standard semantic evaluation tasks.
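The abstract does not spell out the RVA procedure, so the following is only a minimal sketch of one common form of random vector accumulation, not the authors' implementation. It assumes each word receives a fixed sparse random "index" vector and that a word's semantic vector is the running sum of the index vectors of its neighbours within a small window. The function names (`make_index_vector`, `train_rva`, `cosine`), the dimensionality, the sparsity, and the window size are all illustrative assumptions.

```python
import math
import random
from collections import defaultdict


def make_index_vector(dim, nonzero, rng):
    """Return a sparse ternary vector: `nonzero` entries are +1 or -1, the rest 0."""
    vec = [0.0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((1.0, -1.0))
    return vec


def train_rva(sentences, dim=512, nonzero=8, window=2, seed=0):
    """Accumulate, for each word, the index vectors of its neighbours.

    All parameter defaults are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    index = {}                                 # word -> fixed random index vector
    memory = defaultdict(lambda: [0.0] * dim)  # word -> accumulated context vector
    for sentence in sentences:
        for i, word in enumerate(sentence):
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                neighbour = sentence[j]
                if neighbour not in index:
                    index[neighbour] = make_index_vector(dim, nonzero, rng)
                target = memory[word]
                # Incremental update: add the neighbour's index vector in place.
                for k, value in enumerate(index[neighbour]):
                    target[k] += value
    return dict(memory)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

In this sketch each new sentence only adds to existing vectors, so the corpus can be processed in a single pass with no global reduction step. That is the property the abstract contrasts with approaches based on dimensionality reduction. Words that occur in similar contexts, such as "cat" and "dog" in a toy corpus where both are followed by "eats food", end up with similar accumulated vectors under `cosine`.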

Similar resources

A Joint Semantic Vector Representation Model for Text Clustering and Classification

Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...

Flexible Similarity Search of Semantic Vectors Using Fulltext Search Engines

Vector representations and vector space modeling (VSM) play a central role in modern machine learning. In our recent research we proposed a novel approach to ‘vector similarity searching’ over dense semantic vector representations. This approach can be deployed on top of traditional inverted-index-based fulltext engines, taking advantage of their robustness, stability, scalability and ubiquity....

Applying the Semantic Web: The VICODI Experience in Creating Visual Contextualization for History

Semantic Web applications in the humanities that visualize knowledge are still few and far between. The Visual Contextualization of Digital Content (VICODI) project brought together Semantic Web technologies with the concepts of contextualization and visualization of knowledge, an approach which we term visual contextualization. The goal was to enhance users’ understanding of digital content in...

SMORE – Semantic Markup, Ontology, and RDF Editor

The promise of the Semantic Web is founded on the principle that online content will be semantically annotated, creating machine-understandable content using interlinking ontologies. In keeping with this principle, we introduce SMORE, the Semantic Markup, Ontology, and RDF Editor. It provides users with an integrated environment for creating web pages, email, and other online content while faci...

Information Realisation: Textual, Graphical and Audial Representations of the Semantic Web

Information Realisation is the process of presenting data as Textual, Graphical or Audial information to a human user. In this paper, we discuss the importance of this concept with respect to the accessibility of Semantic Web data to a diverse target audience. We provide an ontological point of view, defining the expressive characteristics and application domain of representation formats, thus ...

Publication date: 2009